AITopics | permutation recovery

Collaborating Authors

permutation recovery

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Permutation recovery of spikes in noisy high-dimensional tensor estimation

Arous, Gérard Ben, Gerbelot, Cédric, Piccolo, Vanessa

arXiv.org Machine LearningDec-20-2024

We study the dynamics of gradient flow in high dimensions for the multi-spiked tensor problem, where the goal is to estimate $r$ unknown signal vectors (spikes) from noisy Gaussian tensor observations. Specifically, we analyze the maximum likelihood estimation procedure, which involves optimizing a highly nonconvex random function. We determine the sample complexity required for gradient flow to efficiently recover all spikes, without imposing any assumptions on the separation of the signal-to-noise ratios (SNRs). More precisely, our results provide the sample complexity required to guarantee recovery of the spikes up to a permutation. Our work builds on our companion paper [Ben Arous, Gerbelot, Piccolo 2024], which studies Langevin dynamics and determines the sample complexity and separation conditions for the SNRs necessary for ensuring exact recovery of the spikes (where the recovered permutation matches the identity). During the recovery process, the correlations between the estimators and the hidden vectors increase in a sequential manner. The order in which these correlations become significant depends on their initial values and the corresponding SNRs, which ultimately determines the permutation of the recovered spikes.

artificial intelligence, machine learning, recovery, (17 more...)

arXiv.org Machine Learning

2412.1465

Country:

Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.25)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Add feedback

The Phase Transition Phenomenon of Shuffled Regression

Zhang, Hang, Li, Ping

arXiv.org Machine LearningOct-31-2023

We study the phase transition phenomenon inherent in the shuffled (permuted) regression problem, which has found numerous applications in databases, privacy, data analysis, etc. In this study, we aim to precisely identify the locations of the phase transition points by leveraging techniques from message passing (MP). In our analysis, we first transform the permutation recovery problem into a probabilistic graphical model. We then leverage the analytical tools rooted in the message passing (MP) algorithm and derive an equation to track the convergence of the MP algorithm. By linking this equation to the branching random walk process, we are able to characterize the impact of the signal-to-noise-ratio ($\snr$) on the permutation recovery. Depending on whether the signal is given or not, we separately investigate the oracle case and the non-oracle case. The bottleneck in identifying the phase transition regimes lies in deriving closed-form formulas for the corresponding critical points, but only in rare scenarios can one obtain such precise expressions. To tackle this technical challenge, this study proposes the Gaussian approximation method, which allows us to obtain the closed-form formulas in almost all scenarios. In the oracle case, our method can fairly accurately predict the phase transition $\snr$. In the non-oracle case, our algorithm can predict the maximum allowed number of permuted rows and uncover its dependency on the sample number.

artificial intelligence, machine learning, trpmq, (16 more...)

arXiv.org Machine Learning

2310.20438

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > Massachusetts (0.04)
(4 more...)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Add feedback

Optimal Estimator for Linear Regression with Shuffled Labels

Zhang, Hang, Li, Ping

arXiv.org Machine LearningOct-2-2023

This paper considers the task of linear regression with shuffled labels, i.e., $\mathbf Y = \mathbf \Pi \mathbf X \mathbf B + \mathbf W$, where $\mathbf Y \in \mathbb R^{n\times m}, \mathbf Pi \in \mathbb R^{n\times n}, \mathbf X\in \mathbb R^{n\times p}, \mathbf B \in \mathbb R^{p\times m}$, and $\mathbf W\in \mathbb R^{n\times m}$, respectively, represent the sensing results, (unknown or missing) corresponding information, sensing matrix, signal of interest, and additive sensing noise. Given the observation $\mathbf Y$ and sensing matrix $\mathbf X$, we propose a one-step estimator to reconstruct $(\mathbf \Pi, \mathbf B)$. From the computational perspective, our estimator's complexity is $O(n^3 + np^2m)$, which is no greater than the maximum complexity of a linear assignment algorithm (e.g., $O(n^3)$) and a least square algorithm (e.g., $O(np^2 m)$). From the statistical perspective, we divide the minimum $snr$ requirement into four regimes, e.g., unknown, hard, medium, and easy regimes; and present sufficient conditions for the correct permutation recovery under each regime: $(i)$ $snr \geq \Omega(1)$ in the easy regime; $(ii)$ $snr \geq \Omega(\log n)$ in the medium regime; and $(iii)$ $snr \geq \Omega((\log n)^{c_0}\cdot n^{{c_1}/{srank(\mathbf B)}})$ in the hard regime ($c_0, c_1$ are some positive constants and $srank(\mathbf B)$ denotes the stable rank of $\mathbf B$). In the end, we also provide numerical experiments to confirm the above claims.

artificial intelligence, machine learning, piq, (16 more...)

arXiv.org Machine Learning

2310.01326

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > United States > Washington > King County > Bellevue (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Add feedback

Sparse Recovery with Shuffled Labels: Statistical Limits and Practical Estimators

Zhang, Hang, Li, Ping

arXiv.org Machine LearningMar-20-2023

This paper considers the sparse recovery with shuffled labels, i.e., $\by = \bPitrue \bX \bbetatrue + \bw$, where $\by \in \RR^n$, $\bPi\in \RR^{n\times n}$, $\bX\in \RR^{n\times p}$, $\bbetatrue\in \RR^p$, $\bw \in \RR^n$ denote the sensing result, the unknown permutation matrix, the design matrix, the sparse signal, and the additive noise, respectively. Our goal is to reconstruct both the permutation matrix $\bPitrue$ and the sparse signal $\bbetatrue$. We investigate this problem from both the statistical and computational aspects. From the statistical aspect, we first establish the minimax lower bounds on the sample number $n$ and the \emph{signal-to-noise ratio} ($\snr$) for the correct recovery of permutation matrix $\bPitrue$ and the support set $\supp(\bbetatrue)$, to be more specific, $n \gtrsim k\log p$ and $\log\snr \gtrsim \log n + \frac{k\log p}{n}$. Then, we confirm the tightness of these minimax lower bounds by presenting an exhaustive-search based estimator whose performance matches the lower bounds thereof up to some multiplicative constants. From the computational aspect, we impose a parsimonious assumption on the number of permuted rows and propose a computationally-efficient estimator accordingly. Moreover, we show that our proposed estimator can obtain the ground-truth $(\bPitrue, \supp(\bbetatrue))$ under mild conditions. Furthermore, we provide numerical experiments to corroborate our claims.

artificial intelligence, estimator, supp, (16 more...)

arXiv.org Machine Learning

2303.11233

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback

Permuted and Unlinked Monotone Regression in $\mathbb{R}^d$: an approach based on mixture modeling and optimal transport

Slawski, Martin, Sen, Bodhisattva

arXiv.org Machine LearningJan-10-2022

Suppose that we have a regression problem with response variable Y in $\mathbb{R}^d$ and predictor X in $\mathbb{R}^d$, for $d \geq 1$. In permuted or unlinked regression we have access to separate unordered data on X and Y, as opposed to data on (X,Y)-pairs in usual regression. So far in the literature the case $d=1$ has received attention, see e.g., the recent papers by Rigollet and Weed [Information & Inference, 8, 619--717] and Balabdaoui et al. [J. Mach. Learn. Res., 22(172), 1--60]. In this paper, we consider the general multivariate setting with $d \geq 1$. We show that the notion of cyclical monotonicity of the regression function is sufficient for identification and estimation in the permuted/unlinked regression model. We study permutation recovery in the permuted regression setting and develop a computationally efficient and easy-to-use algorithm for denoising based on the Kiefer-Wolfowitz [Ann. Math. Statist., 27, 887--906] nonparametric maximum likelihood estimator and techniques from the theory of optimal transport. We provide explicit upper bounds on the associated mean squared denoising error for Gaussian noise. As in previous work on the case $d = 1$, the permuted/unlinked setting involves slow (logarithmic) rates of convergence rooting in the underlying deconvolution problem. Numerical studies corroborate our theoretical analysis and show that the proposed approach performs at least on par with the methods in the aforementioned prior work in the case $d = 1$ while achieving substantial reductions in terms of computational complexity.

denote, regression, statistics, (15 more...)

arXiv.org Machine Learning

2201.03528

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Virginia > Fairfax County > Fairfax (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Linear Regression with an Unknown Permutation: Statistical and Computational Limits

Pananjady, Ashwin, Wainwright, Martin J., Courtade, Thomas A.

arXiv.org Machine LearningAug-9-2016

Consider a noisy linear observation model with an unknown permutation, based on observing $y = \Pi^* A x^* + w$, where $x^* \in \mathbb{R}^d$ is an unknown vector, $\Pi^*$ is an unknown $n \times n$ permutation matrix, and $w \in \mathbb{R}^n$ is additive Gaussian noise. We analyze the problem of permutation recovery in a random design setting in which the entries of the matrix $A$ are drawn i.i.d. from a standard Gaussian distribution, and establish sharp conditions on the SNR, sample size $n$, and dimension $d$ under which $\Pi^*$ is exactly and approximately recoverable. On the computational front, we show that the maximum likelihood estimate of $\Pi^*$ is NP-hard to compute, while also providing a polynomial time algorithm when $d =1$.

artificial intelligence, machine learning, permutation recovery, (19 more...)

arXiv.org Machine Learning

1608.02902

Country: North America > United States (0.67)

Genre: Research Report (0.82)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Add feedback